01. Debugging is Hard
L3 03 01 Why Debugging In Spark Is Hard V2
Debugging Spark is harder on Standalone
Debugging Spark is harder on Standalone mode
- Previously, we ran Spark codes in the local mode where you can easily fix the code on your laptop because you can view the error in your code on your local machine.
- For Standalone mode, the cluster (group of manager and executor) load data, distribute the tasks among them and the executor executes the code. The result is either a successful output or a log of the errors. The logs are captured in a separate machine than the executor, which makes it important to interpret the syntax of the logs - this can get tricky.
- One other thing that makes the standalone mode difficult to deploy the code is that your laptop environment will be completely different than AWS EMR or other cloud systems. As a result, you will always have to test your code rigorously on different environment settings to make sure the code works.